Exploring Topic-language Preferences in Multilingual Swahili Information Retrieval in Tanzania


Abstract

Habitual switching of languages is a common behaviour among polyglots when searching for information on the Web. Studies in information retrieval (IR) and multilingual information retrieval (MLIR) suggest that part of the reason for such regular switching is the topic of search. Unlike survey-based studies, this study uses query and click-through logs. It exploits the querying and results selection behaviour of Swahili MLIR system users to explore how the topic of a search (query) is associated with language preferences, referred to here as topic-language preferences. This article is based on a carefully controlled study using Swahili-speaking Web users in Tanzania who interacted with a guided multilingual search engine. From statistical analysis of the query and click-through logs, it was revealed that language preferences may be associated with the topics of search. The results also suggest that language preferences are not static; they vary along the course of a search, from query to results selection. In most topics, users either had no significant language preference or preferred to query in Kiswahili and changed their preference to English when selecting/clicking results. The findings might provide researchers with more insights for developing better MLIR systems that support certain types of users and scenarios.


Similar resources

Exploring Topic-based Language Models for Effective Web Information Retrieval

The main obstacle for providing focused search is the relative opaqueness of the search request: searchers tend to express their complex information needs in only a couple of keywords. Our overall aim is to find out if, and how, topic-based language models can lead to more effective web information retrieval. In this paper we explore retrieval performance of a topic-based model that combines topica...


Cross-Language Information Retrieval in a Multilingual Legal Domain

We describe here the application of a cross-language information retrieval technique based on similarity thesauri in the domain of Swiss law. We present the theory of similarity thesauri, which are information structures derived from corpora, and show how they can be used for cross-language retrieval. We also discuss the collections of Swiss legal documents and show how we have used them to co...


Experiments in Multilingual Information Retrieval

The multilingual information retrieval system of the future will need to be able to retrieve documents across language boundaries. This extension of the classical IR problem is particularly challenging, as significant resources are required to perform query translation. At Xerox, we are working to build a multilingual IR system and conducting a series of experiments to understand what factors ar...


Expanding a multilingual media monitoring and information extraction tool to a new language: Swahili

The Europe Media Monitor (EMM) family of applications is a set of multilingual tools that gather, cluster and classify news in currently fifty languages and that extract named entities and quotations (reported speech) from twenty languages. In this paper, we describe the recent effort of adding the African Bantu language Swahili to EMM. EMM is designed in an entirely modular way, allowing plugg...



Journal

Journal title: ACM Transactions on Asian and Low-Resource Language Information Processing

Year: 2021

ISSN: 2375-4699, 2375-4702

DOI: https://doi.org/10.1145/3458671